<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<link rel="stylesheet" href="../style/journal.css" type="text/css" />
<style type="text/css"><!--
.googleadsense {
	margin: 2px;
	padding: 0px;
//--></style><script src="http://www.google-analytics.com/urchin.js" type="text/javascript">
</script>
<script type="text/javascript">
_uacct = "UA-65008-1";
urchinTracker();
</script><title>批量转网页编码</title>
</head>
<body>
<a href="index.html">Journal</a>(2005) | <a href="../blog/"><b>Blog</b></a>(2006) | <a href="http://www.fayland.org/cgi-bin/random_link.pl">RandomLink</a> | <a href="AboutFayland.html">WhoAmI</a> | <a href="LiveBookmark.html">LiveBookmark</a> | <a href="http://www.fayland.org/">HomePage</a>
<p><&lt;Previous: <a href="050315.html">rt.cpan.org: Bug</a>&nbsp;&nbsp;>>Next: <a href="050320.html">A trip to the Yellow Mountain</a></p>
<h1>批量转网页编码</h1>
<div class='content'>
<p>Category: <a href='Script.html'>Script</a> &nbsp; Keywords: <b>utf-8 gb2312</b></p>转为 utf-8 是为国际接轨。 :) <br>
<a href='http://www.fayland.org/scripts/2utf8.pl.txt'>Download here!</a>

<h2>Code</h2>
<pre>
#!/usr/bin/perl
# convert gb2312 encoding webpage to utf-8 in a directory
use strict;
use warnings;
use Encode qw/from_to/; # load the main func.

# setting
my $dir = 'E:/Fayland/Emag/0503'; # the directory u want to convert.

# get all .html? files
opendir(DIR, $dir);
my @file = readdir(DIR);
closedir(DIR);
@file = grep(/\.html?$/, @file);

# convertion
foreach (@file) {
    # get the file data;
    open(FH, "$dir/$_");
    my @data = <FH>;
    close(FH);
    my $data = join("", @data);
    if ($data =~ /charset\=gb2312/) { # it's not utf-8 yet
        $data =~ s/charset\=gb2312/charset\=utf-8/s;
        from_to($data, "gb2312", "utf8");    
        open(FH, ">$dir/$_");
        print FH $data;
        close(FH);
        print "$_ convert success!\n";
    } else {
        print "$_ is already utf-8\n";
    }
}
</pre></div>
<p><&lt;Previous: <a href="050315.html">rt.cpan.org: Bug</a>&nbsp;&nbsp;>>Next: <a href="050320.html">A trip to the Yellow Mountain</a></p>
<p><strong>Options:</strong> <a href='http://del.icio.us/post?title=%E6%89%B9%E9%87%8F%E8%BD%AC%E7%BD%91%E9%A1%B5%E7%BC%96%E7%A0%81&url=http://www.fayland.org/journal/2utf8.html'>+Del.icio.us</a></p>
<strong>Related items</strong>
<ul><li><a href='utf8.html'>utf8与gb2312编码</a> < <span class='digit'>2004-11-09 12:42:33</span> ></li><li><a href='050115.html'>gb2312.enc</a> < <span class='digit'>2005-01-15 22:57:38</span> ></li></ul>
Created on <span class="digit">2005-03-15 23:16:46</span>, Last modified on <span class="digit">2005-11-08 01:51:57</span><br />
Copyright 2004-2005 All Rights Reserved. Powered by <a href="Eplanet.html">Eplanet</a> && <a href='http://catalyst.perl.org'>Catalyst</a> 5.62.
</body>
</html>